@can-anyscale
Contributor

@can-anyscale can-anyscale commented Jun 11, 2025

This PR updates the opentelemetry-sdk version requirement to a minimum of 1.30, which is necessary to support histogram metrics as part of our migration from OpenCensus to OpenTelemetry.

As part of this upgrade, I am also removing opentelemetry-exporter-otlp from the ray[all] dependency set. This package has been notoriously difficult to resolve alongside other dependencies, and this issue becomes even bigger with the updated SDK version. Since it is only used for the Ray tracing feature, I propose making it an optional, user-controlled dependency instead of a default one.

Additionally, the test environment has been updated to ensure test_tracing.py continues to pass, verifying that tracing functionality remains intact.

Test:

  • CI
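
As a rough illustration of the version floor described above, the sketch below checks whether an installed `opentelemetry-sdk` meets the 1.30 minimum. This is not Ray's actual code: `MIN_SDK_VERSION`, `parse_major_minor`, and `sdk_meets_minimum` are hypothetical names, and the parsing assumes the version string starts with a numeric `major.minor` prefix.

```python
import re
from importlib import metadata
from typing import Optional, Tuple

# Minimum opentelemetry-sdk version stated in this PR (hypothetical constant name).
MIN_SDK_VERSION: Tuple[int, int] = (1, 30)


def parse_major_minor(version: str) -> Tuple[int, int]:
    """Return (major, minor) from a version string such as '1.34.1'."""
    match = re.match(r"(\d+)\.(\d+)", version)
    if match is None:
        raise ValueError(f"unrecognized version string: {version!r}")
    return int(match.group(1)), int(match.group(2))


def sdk_meets_minimum(installed: Optional[str] = None) -> bool:
    """Check a version string (or the installed SDK) against the minimum."""
    if installed is None:
        try:
            installed = metadata.version("opentelemetry-sdk")
        except metadata.PackageNotFoundError:
            return False  # the SDK is not installed at all
    return parse_major_minor(installed) >= MIN_SDK_VERSION
```

Calling `sdk_meets_minimum()` with no argument inspects the current environment; passing a string such as `"1.26.0"` checks that version without needing the package installed.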

@can-anyscale can-anyscale force-pushed the can-otelup branch 2 times, most recently from aef0d64 to 3109971 Compare June 11, 2025 20:34
@can-anyscale can-anyscale added the go add ONLY when ready to merge, run all tests label Jun 11, 2025
@can-anyscale can-anyscale changed the title [core] remove otel from hard dependencies [core] upgrade opentelemetry-sdk Jun 11, 2025
python/setup.py Outdated
Contributor Author

no longer needed since it's a part of default

@can-anyscale can-anyscale force-pushed the can-otelup branch 2 times, most recently from c8c748e to 1ec2518 Compare June 11, 2025 21:17
@can-anyscale
Contributor Author

can-anyscale commented Jun 11, 2025

This is also pending an upgrade to the next vllm release, which also removes opentelemetry-exporter-otlp from its requirements.

---amended by @lk-chen ---
vllm 0.9.2 release contains vllm-project/vllm#19378 which removes opentelemetry from requirements/common.txt

@can-anyscale can-anyscale marked this pull request as ready for review June 11, 2025 22:04
@can-anyscale can-anyscale requested a review from abrarsheikh June 11, 2025 22:04
@can-anyscale can-anyscale changed the title [core] upgrade opentelemetry-sdk [core][rfc] upgrade opentelemetry-sdk Jun 11, 2025
@edoakes
Collaborator

edoakes commented Jun 11, 2025

how recent is 1.30?

@can-anyscale
Contributor Author

I think it's fairly recent: 1.30 was released on Feb 4, 2025. The version currently installed in the Ray image is 1.26, which was released about a year ago.

@edoakes
Collaborator

edoakes commented Jun 11, 2025

I'm a little concerned that such a recent version will cause dependency issues for our users. Do you have a sense of how easy it is to upgrade OT versions? Does it have many pinned transitive dependencies and/or breaking changes?

Comment on lines 39 to 43
Collaborator

Then this will break core telemetry? How do we make this stable?

I thought the plan was to drop the requirement on opentelemetry-exporter-otlp. Why is it still required here?

Collaborator

cc @elliot-barn , our new dependency tzar (prospective)

Contributor Author

discuss offline; will add comments about which packages are not currently resolved with opentelemetry-exporter-otlp

Contributor Author

#updated

@can-anyscale
Contributor Author

can-anyscale commented Jun 11, 2025

@edoakes your concern is completely valid, which is why I opened this as an RFC to gather feedback. Regarding your questions — I believe the upgrade should only be a problem for certain niche workloads. Here's why:

Let’s assume a package is painful to upgrade primarily when it sits deep in the dependency chain — because updating it triggers a cascade of upgrades. Fortunately, opentelemetry-sdk/api sits at the top of the dependency chain in the Python ecosystem. That means not many packages depend on it directly. This is reflected in our requirements_compiled.txt, which includes many packages — but only a couple directly depend on opentelemetry-sdk/api:

  • deprecated (link)
  • importlib-metadata (link)
  • and zipp, which depends on importlib-metadata (link) — but that’s effectively the end of the chain.

Now, there are problematic packages such as opentelemetry-exporter-otlp, which depends on opentelemetry-sdk/api and is known to be tricky to upgrade, particularly for users with tracing workloads, because it pulls in protobuf 5, which can introduce its own issues.

On the other hand, upgrading opentelemetry-sdk/api itself is easy, since it has a minimal set of dependencies (reference).
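
One way to sanity-check this kind of dependency-chain claim locally is to read the requirement metadata of an installed distribution. The sketch below is only illustrative: `requirement_name` and `direct_dependencies` are hypothetical helpers, and the name parsing is a simplification of the full requirement syntax. It relies on the standard-library `importlib.metadata.requires`, which returns the declared requirement strings of an installed package.

```python
import re
from importlib import metadata
from typing import Iterable, List, Optional


def requirement_name(requirement: str) -> str:
    """Extract the normalized project name from a requirement string."""
    match = re.match(r"\s*([A-Za-z0-9][A-Za-z0-9._-]*)", requirement)
    if match is None:
        raise ValueError(f"cannot parse requirement: {requirement!r}")
    # Normalize runs of '-', '_' and '.' to a single '-' and lowercase the name.
    return re.sub(r"[-_.]+", "-", match.group(1)).lower()


def direct_dependencies(requirements: Optional[Iterable[str]]) -> List[str]:
    """Return sorted, de-duplicated names of non-optional requirements."""
    names = set()
    for req in requirements or []:
        if "extra ==" in req:
            continue  # only needed when an optional extra is requested
        names.add(requirement_name(req))
    return sorted(names)


if __name__ == "__main__":
    try:
        # requires() returns None when a package declares no requirements.
        print(direct_dependencies(metadata.requires("opentelemetry-api")))
    except metadata.PackageNotFoundError:
        print("opentelemetry-api is not installed in this environment")
```

A short list from a call like this would be consistent with the "minimal set of dependencies" point; the actual result depends on the version installed.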

@can-anyscale can-anyscale force-pushed the can-otelup branch 2 times, most recently from 4c17235 to 22db8a8 Compare June 12, 2025 00:07
Collaborator

@edoakes edoakes left a comment

Based on your evaluation I'm good with this change to unblock the OT migration.

Let's clearly document it in release notes with a call to action to open a GH issue if it causes any problems for users.

@can-anyscale can-anyscale changed the base branch from master to can-tel10leg June 18, 2025 17:41
@can-anyscale can-anyscale changed the title [core][rfc] upgrade opentelemetry-sdk [core] upgrade opentelemetry-sdk Jun 18, 2025
@can-anyscale can-anyscale force-pushed the can-tel10leg branch 4 times, most recently from 7b02c0a to 69c868d Compare June 19, 2025 18:51
@can-anyscale can-anyscale requested review from a team July 9, 2025 18:14
@can-anyscale can-anyscale force-pushed the can-otelup branch 2 times, most recently from 27f2f37 to c2e39a6 Compare July 9, 2025 18:29
@can-anyscale can-anyscale changed the base branch from can-tel10leg to master July 9, 2025 18:29
@can-anyscale can-anyscale removed request for a team and abrarsheikh July 9, 2025 18:30
@can-anyscale can-anyscale requested a review from aslonnie July 9, 2025 20:01
Contributor Author

CC: @aslonnie, skipping this test as per offline discussion

Collaborator

sounds good.

Collaborator

context: the test uses an old, released wheel file together with the current requirements_compiled.txt file.

REEf team will fix/refactor this test as a follow-up

Collaborator

Could you drop -U and add a version for opentelemetry-exporter-otlp?

Collaborator

Not sure whether this should be kept at the same version as opentelemetry-sdk and opentelemetry-api.

Signed-off-by: can <can@anyscale.com>
# Install tracing dependencies if requested. Intentionally, we do not use
# requirements_compiled.txt as the constraint file. They are not compatible with
# a few packages in that file (e.g. requiring an upgrade to protobuf 5+).
pip install opentelemetry-exporter-otlp==1.34.1
Contributor Author

pinned opentelemetry-exporter-otlp
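
Since the exporter becomes an opt-in install, a small check like the following could tell a user whether their environment matches the pin used in the test setup. This is a minimal sketch under stated assumptions: `installed_version` and `tracing_install_hint` are hypothetical helpers, not part of Ray, and the pinned version is simply the one shown in the snippet above.

```python
from importlib import metadata
from typing import Optional

EXPORTER_DIST = "opentelemetry-exporter-otlp"
PINNED_VERSION = "1.34.1"  # version pinned in this PR's test setup


def installed_version(dist_name: str) -> Optional[str]:
    """Return the installed version of a distribution, or None if absent."""
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


def tracing_install_hint(found: Optional[str]) -> Optional[str]:
    """Return None when the pinned version is present, else a short hint."""
    if found == PINNED_VERSION:
        return None
    pin = f"{EXPORTER_DIST}=={PINNED_VERSION}"
    if found is None:
        return f"{EXPORTER_DIST} is not installed; try: pip install {pin}"
    return f"{EXPORTER_DIST} {found} is installed; the test setup pins {pin}"
```

For example, `tracing_install_hint(installed_version(EXPORTER_DIST))` returns `None` in an environment prepared with the pinned install, and a hint string otherwise.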

@can-anyscale can-anyscale enabled auto-merge (squash) July 9, 2025 22:47
@can-anyscale can-anyscale merged commit a8e49fe into master Jul 10, 2025
6 checks passed
@can-anyscale can-anyscale deleted the can-otelup branch July 10, 2025 00:16
jugalshah291 pushed a commit to jugalshah291/ray_fork that referenced this pull request Sep 11, 2025
Signed-off-by: can <can@anyscale.com>
Signed-off-by: jugalshah291 <shah.jugal291@gmail.com>
dstrodtman pushed a commit to dstrodtman/ray that referenced this pull request Oct 6, 2025
Signed-off-by: can <can@anyscale.com>
Signed-off-by: Douglas Strodtman <douglas@anyscale.com>